智能论文笔记

RandomSCM: interpretable ensembles of sparse classifiers tailored for omics data

Thibaud Godon , Pier-Luc Plante , Baptiste Bauvin , Elina Francovic-Fontaine , Alexandre Drouin , Jacques Corbeil

分类：机器学习

2022-08-11

背景：了解OMICS与表型之间的关系是精确医学中的一个核心问题。代谢组学数据的高维度挑战学习算法在可伸缩性和概括方面。大多数学习算法都不产生可解释的模型 - 方法：我们根据决策规则的结合或分离提出了一种集合学习算法。 - 结果：代谢组学数据的应用显示，它会产生可实现高预测性能的模型。模型的解释性使它们可用于生物标志物发现和高维数据中的模式发现。

translated by 谷歌翻译

TACTiS: Transformer-Attentional Copulas for Time Series

Alexandre Drouin , Étienne Marcotte , Nicolas Chapados

分类：机器学习 | (统计)机器学习

2022-02-07

时间变化数量的估计是医疗保健和金融等领域决策的基本组成部分。但是，此类估计值的实际实用性受到它们量化预测不确定性的准确程度的限制。在这项工作中，我们解决了估计高维多元时间序列的联合预测分布的问题。我们提出了一种基于变压器体系结构的多功能方法，该方法使用基于注意力的解码器估算关节分布，该解码器可被学会模仿非参数Copulas的性质。最终的模型具有多种理想的属性：它可以扩展到数百个时间序列，支持预测和插值，可以处理不规则和不均匀的采样数据，并且可以在训练过程中无缝地适应丢失的数据。我们从经验上证明了这些属性，并表明我们的模型在多个现实世界数据集上产生了最新的预测。

translated by 谷歌翻译

Toward Foundation Models for Earth Monitoring: Proposal for a Climate Change Benchmark

Alexandre Lacoste , Evan David Sherwin , Hannah Kerner , Hamed Alemohammad , Björn Lütjens , Jeremy Irvin , David Dao , Alex Chang , Mehmet Gunturkun , Alexandre Drouin

分类：机器学习

2021-12-01

最近的自我监督进展表明，预先训练大量无监督数据的大型神经网络可能导致下游任务的概括令人印象深刻。这些模型最近被作为基础模型，一直转变为自然语言处理领域。虽然类似的模型也在大型图像的核心训练中，但它们不适合遥感数据。为刺激地球监测基础模型的发展，我们建议开发由与气候变化相关的各种下游任务组成的新基准。我们认为，这可能导致许多现有应用程序的大量改进，并促进新应用的发展。该提案还可以提出合作，并提出更好的评估过程，以减轻地球监测的基础模型的潜在缺陷。

translated by 谷歌翻译

3DSGrasp: 3D Shape-Completion for Robotic Grasp

Seyed S. Mohammadi , Nuno F. Duarte , Dimitris Dimou , Yiming Wang , Matteo Taiana , Pietro Morerio , Atabak Dehban , Plinio Moreno , Alexandre Bernardino , Alessio Del Bue

分类：机器人 | 人工智能

2023-01-02

Real-world robotic grasping can be done robustly if a complete 3D Point Cloud Data (PCD) of an object is available. However, in practice, PCDs are often incomplete when objects are viewed from few and sparse viewpoints before the grasping action, leading to the generation of wrong or inaccurate grasp poses. We propose a novel grasping strategy, named 3DSGrasp, that predicts the missing geometry from the partial PCD to produce reliable grasp poses. Our proposed PCD completion network is a Transformer-based encoder-decoder network with an Offset-Attention layer. Our network is inherently invariant to the object pose and point's permutation, which generates PCDs that are geometrically consistent and completed properly. Experiments on a wide range of partial PCD show that 3DSGrasp outperforms the best state-of-the-art method on PCD completion tasks and largely improves the grasping success rate in real-world scenarios. The code and dataset will be made available upon acceptance.

translated by 谷歌翻译

Robust Bayesian Subspace Identification for Small Data Sets

Alexandre Rodrigues Mesquita

分类： (统计)机器学习

2022-12-29

Model estimates obtained from traditional subspace identification methods may be subject to significant variance. This elevated variance is aggravated in the cases of large models or of a limited sample size. Common solutions to reduce the effect of variance are regularized estimators, shrinkage estimators and Bayesian estimation. In the current work we investigate the latter two solutions, which have not yet been applied to subspace identification. Our experimental results show that our proposed estimators may reduce the estimation risk up to $40\%$ of that of traditional subspace methods.

translated by 谷歌翻译

Automatic Text Simplification of News Articles in the Context of Public Broadcasting

Diego Maupomé , Fanny Rancourt , Thomas Soulas , Alexandre Lachance , Marie-Jean Meurs , Desislava Aleksandrova , Olivier Brochu Dufour , Igor Pontes , Rémi Cardon , Michel Simard

分类：自然语言处理 | 人工智能 | 机器学习

2022-12-26

This report summarizes the work carried out by the authors during the Twelfth Montreal Industrial Problem Solving Workshop, held at Universit\'e de Montr\'eal in August 2022. The team tackled a problem submitted by CBC/Radio-Canada on the theme of Automatic Text Simplification (ATS).

translated by 谷歌翻译

VCNet: A self-explaining model for realistic counterfactual generation

Victor Guyomard , Françoise Fessant , Thomas Guyet , Tassadit Bouadi , Alexandre Termier

分类：人工智能 | 机器学习

2022-12-21

Counterfactual explanation is a common class of methods to make local explanations of machine learning decisions. For a given instance, these methods aim to find the smallest modification of feature values that changes the predicted decision made by a machine learning model. One of the challenges of counterfactual explanation is the efficient generation of realistic counterfactuals. To address this challenge, we propose VCNet-Variational Counter Net-a model architecture that combines a predictor and a counterfactual generator that are jointly trained, for regression or classification tasks. VCNet is able to both generate predictions, and to generate counterfactual explanations without having to solve another minimisation problem. Our contribution is the generation of counterfactuals that are close to the distribution of the predicted class. This is done by learning a variational autoencoder conditionally to the output of the predictor in a join-training fashion. We present an empirical evaluation on tabular datasets and across several interpretability metrics. The results are competitive with the state-of-the-art method.

translated by 谷歌翻译

Recycling diverse models for out-of-distribution generalization

Alexandre Ramé , Kartik Ahuja , Jianyu Zhang , Matthieu Cord , Léon Bottou , David Lopez-Paz

分类：机器学习 | 人工智能 | 计算机视觉

2022-12-20

Foundation models are redefining how AI systems are built. Practitioners now follow a standard procedure to build their machine learning solutions: download a copy of a foundation model, and fine-tune it using some in-house data about the target task of interest. Consequently, the Internet is swarmed by a handful of foundation models fine-tuned on many diverse tasks. Yet, these individual fine-tunings often lack strong generalization and exist in isolation without benefiting from each other. In our opinion, this is a missed opportunity, as these specialized models contain diverse features. Based on this insight, we propose model recycling, a simple strategy that leverages multiple fine-tunings of the same foundation model on diverse auxiliary tasks, and repurposes them as rich and diverse initializations for the target task. Specifically, model recycling fine-tunes in parallel each specialized model on the target task, and then averages the weights of all target fine-tunings into a final model. Empirically, we show that model recycling maximizes model diversity by benefiting from diverse auxiliary tasks, and achieves a new state of the art on the reference DomainBed benchmark for out-of-distribution generalization. Looking forward, model recycling is a contribution to the emerging paradigm of updatable machine learning where, akin to open-source software development, the community collaborates to incrementally and reliably update machine learning models.

translated by 谷歌翻译

Memory-efficient NLLB-200: Language-specific Expert Pruning of a Massively Multilingual Machine Translation Model

Yeskendir Koishekenov , Vassilina Nikoulina , Alexandre Berard

分类：自然语言处理 | 人工智能 | 机器学习

2022-12-19

Compared to conventional bilingual translation systems, massively multilingual machine translation is appealing because a single model can translate into multiple languages and benefit from knowledge transfer for low resource languages. On the other hand, massively multilingual models suffer from the curse of multilinguality, unless scaling their size massively, which increases their training and inference costs. Sparse Mixture-of-Experts models are a way to drastically increase model capacity without the need for a proportional amount of computing. The recently released NLLB-200 is an example of such a model. It covers 202 languages but requires at least four 32GB GPUs just for inference. In this work, we propose a pruning method that allows the removal of up to 80\% of experts with a negligible loss in translation quality, which makes it feasible to run the model on a single 32GB GPU. Further analysis suggests that our pruning metrics allow to identify language-specific experts and prune non-relevant experts for a given language pair.

translated by 谷歌翻译

Biomedical image analysis competitions: The state of current participation practice

Matthias Eisenmann , Annika Reinke , Vivienn Weru , Minu Dietlinde Tizabi , Fabian Isensee , Tim J. Adler , Patrick Godau , Veronika Cheplygina , Michal Kozubek , Sharib Ali

分类：计算机视觉 | 机器学习

2022-12-16

The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants and only 50% of the participants performed ensembling based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.

translated by 谷歌翻译